Sparse Partially Linear Additive Models
Authors
Abstract
The generalized partially linear additive model (GPLAM) is a flexible and interpretable approach to building predictive models. It combines features in an additive manner, allowing them to have either a linear or nonlinear effect on the response. However, the assignment of features to the linear and nonlinear groups is typically assumed known. Thus, to make a GPLAM a viable approach in situations in which little is known a priori about the features, one must overcome two primary model selection challenges: deciding which features to include in the model and determining which features to treat nonlinearly. We introduce sparse partially linear additive models (SPLAMs), which combine model fitting and both of these model selection challenges into a single convex optimization problem. SPLAM provides a bridge between the Lasso and sparse additive models. Through a statistical oracle inequality and thorough simulation, we demonstrate that SPLAM can outperform other methods across a broad spectrum of statistical regimes, including the high-dimensional (p ≫ N) setting. We develop efficient algorithms that are applied to real data sets with half a million samples and over 45,000 features with excellent predictive performance.
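The core modeling idea in the abstract, deciding simultaneously which features to drop, which to keep as linear terms, and which to fit nonlinearly through a single convex penalized problem, can be illustrated with a small sketch. The code below is an illustrative approximation only, not the authors' formulation or algorithm: it assumes a cubic polynomial basis per feature (standing in for a spline basis), a hierarchical group penalty lam * (||beta_j||_2 + gamma * ||beta_j,nonlinear||_2) on each feature block, and a plain proximal-gradient solver; the names fit_splam_sketch, lam, gamma, and degree are invented for this example.

```python
# A minimal sketch of the SPLAM-style idea, not the paper's reference implementation.
# Assumptions (not from the abstract): cubic polynomial basis per feature,
# hierarchical group penalty, proximal gradient descent.
import numpy as np


def basis_expand(x, degree=3):
    """Per-feature basis: column 0 is the linear term, the rest are nonlinear."""
    B = np.column_stack([x ** d for d in range(1, degree + 1)])
    B = B - B.mean(axis=0)
    return B / (B.std(axis=0) + 1e-12)


def group_soft_threshold(v, t):
    """Block soft-thresholding: shrink the whole vector v by t in Euclidean norm."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v


def prox_hierarchical(beta_j, lam, gamma, step):
    """Prox of step * lam * (||beta_j|| + gamma * ||beta_j[1:]||) for one block.
    For nested groups, apply the inner (nonlinear) threshold first, then the outer."""
    out = beta_j.copy()
    out[1:] = group_soft_threshold(out[1:], step * lam * gamma)   # nonlinear part
    return group_soft_threshold(out, step * lam)                  # whole block


def fit_splam_sketch(X, y, lam=0.1, gamma=1.0, degree=3, n_iter=500):
    n, p = X.shape
    B = np.hstack([basis_expand(X[:, j], degree) for j in range(p)])
    beta = np.zeros(B.shape[1])
    intercept = y.mean()                       # exact, since basis columns are centered
    step = n / (np.linalg.norm(B, 2) ** 2)     # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = B.T @ (B @ beta + intercept - y) / n
        beta = beta - step * grad
        for j in range(p):                     # block-wise hierarchical prox
            sl = slice(j * degree, (j + 1) * degree)
            beta[sl] = prox_hierarchical(beta[sl], lam, gamma, step)
    return intercept, beta.reshape(p, degree)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 500, 10
    X = rng.uniform(-1.0, 1.0, size=(n, p))
    # Simulated truth: feature 0 linear, feature 1 nonlinear, the rest irrelevant.
    y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + 0.1 * rng.standard_normal(n)
    _, coef = fit_splam_sketch(X, y, lam=0.05, gamma=1.0)
    for j in range(p):
        if np.allclose(coef[j], 0.0):
            kind = "excluded"
        elif np.allclose(coef[j][1:], 0.0):
            kind = "linear"
        else:
            kind = "nonlinear"
        print(f"feature {j}: {kind}")
```

The nesting of the two norms is what lets one convex penalty express both selection decisions: zeroing an entire block removes the feature from the model, while zeroing only the nonlinear sub-block keeps the feature as a purely linear term.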
Similar papers
Estimation of a groupwise additive multiple-index model and its applications
In this paper, we propose a simple linear least squares framework to deal with estimation and selection for a groupwise additive multiple-index model, of which the partially linear single-index model is a special case, and in which each component function has a single-index structure. We show that, somewhat unexpectedly, all index vectors can be recovered through a single least squares coeffici...
On High Dimensional Post-Regularization Prediction Intervals
This paper considers the construction of prediction intervals for future observations in high dimensional regression models. We propose a new approach to evaluate the uncertainty for estimating the mean parameter based on the widely-used penalization/regularization methods. The proposed method is then applied to construct prediction intervals for sparse linear models as well as sparse additive ...
Sparse Regularization for High Dimensional Additive Models
We study the behavior of the l1 type of regularization for high dimensional additive models. Our results suggest remarkable similarities and differences between linear regression and additive models in high dimensional settings. In particular, our analysis indicates that, unlike in linear regression, l1 regularization does not yield optimal estimation for additive models of high dimensionality....
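For context, one standard form of ℓ1-type regularization for additive models (a common formulation in this literature; the snippet above is truncated, so this is not necessarily the exact estimator it analyzes) penalizes the empirical norm of each component function:

\[
\hat f \;=\; \operatorname*{arg\,min}_{f_1,\dots,f_p}\;
\frac{1}{2n}\sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} f_j(x_{ij})\Big)^2
\;+\; \lambda \sum_{j=1}^{p} \Big(\frac{1}{n}\sum_{i=1}^{n} f_j(x_{ij})^2\Big)^{1/2}
\]

With λ large enough, entire components f_j are set to zero, which is the additive-model analogue of the coefficient sparsity that the Lasso produces in linear regression.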
Robust Estimation in Linear Regression with Multicollinearity and Sparse Models
One of the factors affecting the statistical analysis of data is the presence of outliers. Methods that are not affected by outliers are called robust methods. Robust regression methods estimate the parameters of a regression model robustly in the presence of outliers. Besides outliers, linear dependency among the regressor variables, which is called multicollinearity...
Structured Sparse Additive Models
1.1 Parametric models: Linear Regression with non-linear basis functions. Although linear regression with a linear basis is widely used in many areas, it is not powerful enough for many real-world cases, since not all relationships in the real world are linear. However, we can use non-linear basis functions to deal with non-linear relationships. The model is still just a linear combination of some fun...
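Because the snippet above is cut off, here is a brief self-contained illustration of the point it makes: a model that is linear in its coefficients can still capture a nonlinear relationship by regressing on nonlinear basis functions of the input. The polynomial basis and the simulated data are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = np.sin(x) + 0.1 * rng.standard_normal(200)   # nonlinear ground truth

# Expand the single input into a set of basis functions: 1, x, x^2, x^3.
B = np.column_stack([x ** d for d in range(4)])

# Ordinary least squares on the expanded design; the model is still linear
# in the coefficients even though it is nonlinear in x.
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print("fitted basis coefficients:", np.round(coef, 3))
```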
Journal:
CoRR
Volume: abs/1407.4729
Issue: -
Pages: -
Publication date: 2014